Automatic Assessment of Vocabulary Usage Without Negative Evidence

Authors

  • Claudia Leacock
  • Martin Chodorow
Abstract

This report describes the implementation and evaluation of an automated statistical method for assessing an examinee's use of vocabulary words in constructed responses. The grammatical error-detection system, ALEK (Assessing Lexical Knowledge), infers negative evidence from the low frequency or absence of constructions in 30 million words of well-formed, copy-edited text from North American newspapers. ALEK detects two types of errors: those that violate basic principles of English syntax (e.g., agreement errors, as in "a desks") and those that show a lack of information about a specific word (e.g., treating a mass noun as a count noun, as in "a pollution"). The system evaluated word usage in essay-length responses to Test of English as a Foreign Language (TOEFL) prompts. ALEK was developed using three words and was evaluated on an additional 20 words that appeared frequently in TOEFL essays and in a university word list. System accuracy was evaluated to investigate its potential for scoring performance-based measures of communicative competence. ALEK performed with about 80% precision and 20% recall. False positives (correct usages that ALEK identified as errors) and misses (usage errors that ALEK did not recognize) were analyzed, and methods for improving system performance were outlined.
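The idea of inferring negative evidence from corpus frequency, and the precision and recall figures quoted above, can be illustrated with a toy sketch. This is not ALEK's actual implementation: the abstract does not specify which constructions are counted or how the threshold is set, so the word-bigram counts, the `min_count` threshold, the three-sentence reference corpus, and all function names below are assumptions made purely for illustration.

```python
from collections import Counter

def bigrams(tokens):
    """Return adjacent token pairs from a list of tokens."""
    return list(zip(tokens, tokens[1:]))

def build_counts(corpus_sentences):
    """Count word bigrams over a reference corpus of well-formed sentences."""
    counts = Counter()
    for sentence in corpus_sentences:
        counts.update(bigrams(sentence.lower().split()))
    return counts

def flag_rare_bigrams(response, target, counts, min_count=1):
    """Flag bigrams containing `target` seen fewer than `min_count` times.

    Absence (or rarity) in the reference corpus is treated as a hint of
    misuse; `min_count` is a hypothetical threshold, not a value from the report.
    """
    tokens = response.lower().split()
    return [bg for bg in bigrams(tokens)
            if target in bg and counts[bg] < min_count]

def precision_recall(flagged, true_errors):
    """Precision = hits / flagged items; recall = hits / true errors."""
    flagged, true_errors = set(flagged), set(true_errors)
    hits = len(flagged & true_errors)
    precision = hits / len(flagged) if flagged else 0.0
    recall = hits / len(true_errors) if true_errors else 0.0
    return precision, recall

# Tiny stand-in for the 30-million-word newspaper corpus (invented sentences).
reference = [
    "pollution is a serious problem",
    "the city reduced pollution last year",
    "air pollution harms health",
]
counts = build_counts(reference)
flags = flag_rare_bigrams("a pollution is bad for the city", "pollution", counts)
print(flags)
```

In this toy setup the bigram ("a", "pollution") never occurs in the reference sentences, so it is flagged, while ("pollution", "is") does occur and is left alone. A detector that flags five usages, four of them real errors, out of twenty real errors in total would score 0.8 precision and 0.2 recall under `precision_recall`, the same shape of trade-off the abstract reports.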

Similar resources

Diagnosing L2 Receptive Vocabulary Development Using Dynamic Assessment: A Microgenetic Study

The present study is an attempt to shed light on the effect of Dynamic Assessment (DA) on diagnosing and developing the receptive vocabulary abilities of upper-intermediate learners of English as a foreign language. Fifty L2 learners took the First Certificate in English test and completed the Vocabulary Knowledge Scale. Out of 50 students, ten learners who were identified as being ...

A Specialized WFST Approach for Class Models and Dynamic Vocabulary

In this paper we describe a specialized Weighted Finite State Transducer (WFST) framework for handling class language models and dynamic vocabulary in automatic speech recognition. The proposed framework has several important features: a fused composition algorithm that substantially reduces memory usage in comparison to generic WFST operations, and an efficient dynamic vocabulary scheme th...

Automatic diagnostic and assessment procedures for the comparison and optimisation of time encoded speech (TES) DVI systems

The development of simple automatic diagnostic and assessment procedures for the comparison and optimisation of TES-based DVI systems is presented. The use of a "Diagnostic Matrix" [1] to assess the variability of input acoustic events purporting to be the same word and to indicate the degree of orthogonality between the acoustic events which form the vocabulary under examination, is discussed....

Semi-Automatic Acoustic Model Generation from Large Unsynchronized Audio and Text Chunks

In this paper an effective technique to train an acoustic model from large and unsynchronized audio and text chunks is presented. Given such a speech corpus, an algorithm to automatically segment each chunk into smaller fragments and to synchronize those to the corresponding text is defined. These smaller fragments are more suitable to be used in standard model training algorithms for usage in ...

Vocab.at – Automatic Linked Data Documentation And Vocabulary Usage Analysis

A growing amount of Linked Data is being published as RDF data dumps, as RDFa embedded in HTML pages, and via SPARQL endpoints. Unfortunately, the available data is often poorly documented and the consistency of the datasets is unknown. Gaining an understanding of whether a dataset qualifies for the intended use can then be very time-consuming and impede the re-use of the data. When considering...

Publication date: 2001